Bioinformatics (Thomas Dandekar, Meik Kunz)



As can be seen, these prediction programs are quite easy to use and provide a relatively quick first insight into possible TFBSs, for example in unknown sequences, but they usually report a large number of predicted binding sites. It is therefore important to know the exact parameters of the individual programs in order to obtain meaningful results for further experimental investigation. If I am careless and choose, for example, too high a “dissimilarity rate”, I may get hits that have no biological meaning at all. Consequently, for further investigations, the position with the lowest dissimilarity rate should always be selected for the desired TFBS, i.e. the one that most closely matches the search template (here for NF-AT2, e.g., position 632–640 with a dissimilarity rate of <5%). In any case, bioinformatically predicted TFBSs must be validated experimentally. Only then can I be sure that the transcription factor found actually has an effect on gene expression; otherwise, only the DNA nucleotides match the prediction (which is why I got a hit), but the hit has no biological relevance.
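The selection rule described above (discard hits above a dissimilarity cutoff, then keep the best-matching position per factor) can be sketched in a few lines of Python. This is only a minimal illustration, not the output format of any particular prediction program: the class `TfbsHit`, the function `best_hits`, and all hit values except the NF-AT2 position 632–640 are hypothetical, and the 4.2% value is merely an assumed number below the <5% mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TfbsHit:
    """One predicted binding site (hypothetical record layout)."""
    factor: str           # transcription factor name
    start: int            # first position of the predicted site
    end: int              # last position of the predicted site
    dissimilarity: float  # dissimilarity to the search template, in percent


def best_hits(hits, max_dissimilarity=5.0):
    """Drop hits at or above the cutoff, then keep the hit with the
    lowest dissimilarity rate for each transcription factor."""
    best = {}
    for hit in hits:
        if hit.dissimilarity >= max_dissimilarity:
            continue  # too dissimilar to the template to be worth following up
        current = best.get(hit.factor)
        if current is None or hit.dissimilarity < current.dissimilarity:
            best[hit.factor] = hit
    return best


# Illustrative values only; these are not real prediction results.
hits = [
    TfbsHit("NF-AT2", 118, 126, 12.5),
    TfbsHit("NF-AT2", 401, 409, 9.8),
    TfbsHit("NF-AT2", 632, 640, 4.2),
]

selected = best_hits(hits)
for factor, hit in selected.items():
    print(f"{factor}: position {hit.start}-{hit.end}, "
          f"dissimilarity {hit.dissimilarity}%")
```

Such a filter only narrows down the candidates for the laboratory; it does not replace the experimental validation.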

Finally, another option is to annotate the genome sequence: examine it with BLAST and thereby directly identify the proteins it encodes. For example, Psi-BLAST allows me

19 Tutorial: An Overview of Important Databases and Programs